Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Ensure local snapshot staging directory exists for restore #2449

Merged
merged 3 commits into from
Dec 23, 2024

Conversation

pcholakov
Copy link
Contributor

Discovered a regression in the snapshot restore path which prevents it from working currently.

This also adds a basic snapshot roundtrip smoke test.

@pcholakov
Copy link
Contributor Author

@muhamadazmy this is a pretty dumb bug; the fix is just ensuring that the pp-snapshots directory exists.

@jackkleeman adding you as an optional reviewer in case you might have a suggestion for improving the e2e test - feel free to ignore!

Copy link

github-actions bot commented Dec 20, 2024

Test Results

  7 files  ±0    7 suites  ±0   4m 22s ⏱️ - 1m 51s
 47 tests ±0   46 ✅ ±0  1 💤 ±0  0 ❌ ±0 
182 runs  ±0  179 ✅ ±0  3 💤 ±0  0 ❌ ±0 

Results for commit 2edaebf. ± Comparison against base commit 5ed6c1a.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@muhamadazmy muhamadazmy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @pcholakov for fixing this issue. I left couple of questions below.

use restate_types::config::{LogFormat, MetadataStoreClient};
use restate_types::net::AdvertisedAddress;
use restate_types::protobuf::cluster::node_state::State;
use restate_types::protobuf::cluster::{NodeState, RunMode};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running cargo clippy --all-targets --workspace -- -D warnings usually helps with this kind of CI errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally do just lint but point taken - in this case, I did not :-)

@@ -335,6 +335,10 @@ impl StorageOptions {
pub fn data_dir(&self) -> PathBuf {
super::data_dir("db")
}

pub fn snapshots_staging_dir(&self) -> PathBuf {
super::data_dir("pp-snapshots")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that super::data_dir is feature gated by #[cfg(any(test, feature = "test-util"))] won't this mean it should not be used outside of tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't noticed that, just copied what we already do in data_dir(..) just above. Doesn't seem to hurt - but I'm somewhat surprised that this works. And we already use that and node_dir() from production code elsewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i am surprised that it's working to!

Comment on lines +408 to +411
if !self.staging_dir.exists() {
std::fs::create_dir_all(&self.staging_dir)?;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this make more sense if moved to create_if_configured ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deliberately put it closer to the point where we need to use it; a couple of reasons:

  • we won't create it if a snapshots destination is configured, but the node only ever produces snapshots, and never downloads any
  • we will (re-)create it just-in-time, in case someone or something helpfully deletes the empty staging directory created on startup

@pcholakov pcholakov merged commit 25a89c0 into main Dec 23, 2024
14 checks passed
@pcholakov pcholakov deleted the snapshots/fix-restore branch December 23, 2024 12:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants